docs(solutions): signed webhook ingress hardening patterns#699
Conversation
Documents the reusable security patterns from the announce webhook: raw-bytes HMAC signing, signed-timestamp replay window, no-oracle 401, atomic reserve/commit/release replay handling, streaming body-size cap, socket-keyed rate limiting, untrusted-payload mention hardening, and bounded outbound calls.
fro-bot
left a comment
There was a problem hiding this comment.
Verdict: PASS
Documentation-only PR. It captures the security patterns behind the POST /v1/announce webhook (#697) as a best-practices solution doc. I verified every code-bearing claim against the actual implementation in packages/gateway/src/http/ and packages/gateway/src/discord/presence.ts — the doc is accurate, and the snippets faithfully reflect shipped, already-reviewed code. No source, schema, public API, or test behavior changes in this PR, so the correctness/security/breaking-change/coverage surface is the prose itself.
Spot-checks that matched the implementation:
- Raw-bytes HMAC over
timestampHeader + "." + rawBodywithtimingSafeEqual—hmac.ts:31,51. - Identical 401 via shared
UNAUTHORIZED_BODYfor hmac/timestamp/replay —announce-handler.ts:81,135,143,151. - Atomic synchronous
reserve()before the await;commitafter success;releaseon every post-reserve early return —replay-cache.ts:104-114,announce-handler.ts:149,160,173,182,195,200. - Streaming
bodyLimitahead of the handler + content-length fast path + post-bufferbyteLengthguard —server.ts:77-98,announce-handler.ts:107. - Socket-keyed rate limit (
getConnInfo, not XFF) with boundedMAX_KEYSfail-closed —server.ts:108-109,rate-limit.ts:20,68. allowedMentions: {parse: []}on every send + non-emptyrendered_textfallback —presence.ts:89-94, andPromise.racetimeout with no-op.catch—presence.ts:104-118.
Blocking issues
None.
Non-blocking concerns
- §6 omits the absent-socket-address fallback. The doc shows
connInfo.remote.address ?? undefined, but the handler maps an absent key to a shared constant bucketDEFAULT_SOURCE_KEY = '__unknown__'(announce-handler.ts:48,104). That means all keyless requests share one rate-limit bucket — a deliberate and reasonable fail-closed choice, but a reader copying only the doc's snippet won't reproduce it. Worth one sentence so the pattern is transferable. - §4 /
replay-cache.tsreserved-entry eviction caveat is not mentioned. Reserved entries are never time-evicted (replay-cache.ts:69-92); the safety net is process restart, since §8's outbound timeout guaranteesrelease()runs. The doc's §8 implicitly covers this, but the doc presents the reserve/commit/release invariant as self-sufficient. A one-line note that "no outbound timeout ⇒ a reserved key leaks until restart" would tie §4 and §8 together explicitly and reflect the actual single-replica/in-memory constraint documented inreplay-cache.ts:16-19. - Single-replica constraint not surfaced in the doc.
replay-cache.ts:16-19andrate-limit.tsare in-memory and per-process; multi-replica deployments need a shared store or single replica. Since this is a reusable pattern doc, calling out that the replay/rate-limit guarantees are per-process would prevent a future surface from inheriting the patterns and silently losing them behind a load balancer.
Missing tests
None. No behavior changes; the underlying modules already carry tests (hmac.test.ts, replay-cache.test.ts, rate-limit.test.ts, announce-handler.test.ts, server.test.ts, presence/templates tests).
Risk assessment (LOW/MED/HIGH): LOW
Docs-only, additive (+245/-0, one new file), no executable paths touched. Zero regression or security blast radius. The non-blocking notes are about completeness/transferability of the prose, not correctness.
Run Summary
| Field | Value |
|---|---|
| Event | pull_request |
| Repository | fro-bot/agent |
| Run ID | 26672824020 |
| Cache | hit |
| Session | ses_1892481baffeA9eshaf9VSupz5 |
fro-bot
left a comment
There was a problem hiding this comment.
Verdict: PASS
Re-review after commit ed30162 (merge of main into the branch). The PR's net diff is still exactly one added file — docs/solutions/best-practices/signed-webhook-ingress-hardening-2026-05-29.md (+245/-0) — and the merge introduced no changes to the gateway HTTP implementation the doc describes (packages/gateway/src/http/ is untouched on main since my prior pass). My earlier verification that every code-bearing claim matches the shipped code therefore still holds; nothing reviewable changed. Approval stands.
Blocking issues
None.
Non-blocking concerns
Carried forward from the prior review (all optional doc-completeness notes, none affecting correctness):
- §6 omits the absent-socket-address fallback. The snippet shows
connInfo.remote.address ?? undefined, but the handler maps an absent key to a shared bucketDEFAULT_SOURCE_KEY = '__unknown__'(announce-handler.ts:48,104). All keyless requests share one rate-limit bucket — a deliberate fail-closed choice a reader copying only the snippet won't reproduce. One sentence would make the pattern transferable. - §4↔§8 reserved-entry leak link is implicit. Reserved entries are never time-evicted (
replay-cache.ts:69-92); the safety net is process restart, guaranteed only because §8's outbound timeout forcesrelease(). A one-line "no outbound timeout ⇒ reserved key leaks until restart" would tie the two sections together explicitly. - Single-replica constraint not surfaced. The replay cache and rate limiter are in-memory/per-process (
replay-cache.ts:16-19); a multi-replica deployment silently loses both guarantees. Worth stating in a reusable pattern doc.
Missing tests
None. No behavior changes; the underlying modules already carry tests (hmac.test.ts, replay-cache.test.ts, rate-limit.test.ts, announce-handler.test.ts, server.test.ts, plus presence/templates).
Risk assessment (LOW/MED/HIGH): LOW
Docs-only, additive, one new file; the only new commit is a no-op main merge that touches no executable paths. Zero regression or security blast radius.
Run Summary
| Field | Value |
|---|---|
| Event | pull_request |
| Repository | fro-bot/agent |
| Run ID | 26685069545 |
| Cache | hit |
| Session | ses_1892481baffeA9eshaf9VSupz5 |
Captures the reusable security patterns behind the announce webhook (#697) as a best-practices solution doc, so the next signed-webhook surface starts from the hardened shape instead of rediscovering it.
Patterns documented:
Cross-linked with the existing slash-command orchestration doc as the two halves of the gateway's inbound trust boundary.